Probabilistic-trajectory segmental HMMs
Abstract
“Segmental hidden Markov models” (SHMMs) are intended to overcome important speech-modelling limitations of the conventional-HMM approach by representing sequences (or segments) of features and incorporating the concept of trajectories to describe how features change over time. A novel feature of the approach presented in this paper is that extra-segmental variability between different examples of a sub-phonemic speech segment is modelled separately from intra-segmental variability within any one example. The extra-segmental component of the model is represented in terms of variability in the trajectory parameters, and these models are therefore referred to as “probabilistic-trajectory segmental HMMs” (PTSHMMs). This paper presents the theory of PTSHMMs using a linear trajectory description characterized by slope and mid-point parameters, and presents theoretical and experimental comparisons between different types of PTSHMMs, simpler SHMMs and conventional HMMs. Experiments have demonstrated that, for any given feature set, a linear PTSHMM can substantially reduce the error rate in comparison with a conventional HMM, both for a connected-digit recognition task and for a phonetic classification task. Performance benefits have been demonstrated from incorporating a linear trajectory description and additionally from modelling variability in the mid-point parameter. © 1999 British Crown Copyright/DERA
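The separation of intra-segmental and extra-segmental variability described in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes a single scalar feature, Gaussian noise around the trajectory, a fixed slope, and a Gaussian distribution over the mid-point, and every function and parameter name below is hypothetical. The first likelihood scores a segment against one fixed linear trajectory; the second integrates out the mid-point, which adds a constant term to every entry of the segment covariance.

```python
import numpy as np

def linear_trajectory(midpoint, slope, n_frames):
    """Trajectory value per frame, with time measured from the segment centre."""
    t = np.arange(n_frames) - (n_frames - 1) / 2.0
    return midpoint + slope * t

def fixed_trajectory_loglik(y, midpoint, slope, intra_var):
    """Log-likelihood of a 1-D segment around one fixed linear trajectory.

    Only intra-segmental variance is modelled (illustrative assumption).
    """
    y = np.asarray(y, dtype=float)
    resid = y - linear_trajectory(midpoint, slope, len(y))
    return -0.5 * (len(y) * np.log(2 * np.pi * intra_var)
                   + resid @ resid / intra_var)

def probabilistic_midpoint_loglik(y, mid_mean, mid_var, slope, intra_var):
    """Log-likelihood with a Gaussian mid-point integrated out.

    Assumed model: y_t = c + slope * t + e_t, with c ~ N(mid_mean, mid_var)
    (extra-segmental) and e_t ~ N(0, intra_var) (intra-segmental), so the
    segment is jointly Gaussian with covariance intra_var * I + mid_var * 1 1^T.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    resid = y - linear_trajectory(mid_mean, slope, n)
    cov = intra_var * np.eye(n) + mid_var * np.ones((n, n))
    _, logdet = np.linalg.slogdet(cov)
    quad = resid @ np.linalg.solve(cov, resid)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + quad)
```

Under these assumptions, setting `mid_var` to zero makes the second function agree with the first, which corresponds to a segmental model with a single fixed trajectory per state.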
Related articles
Linear dynamic segmental HMMs: variability representation and training procedure
This paper describes investigations into the use of linear dynamic segmental hidden Markov models (SHMMs) for modelling speech feature-vector trajectories and their associated variability. These models use linear trajectories to describe how features change over time, and distinguish between extrasegmental variability of different trajectories and intrasegmental variability of individual observ...
Detection of social speech signals using adaptation of segmental HMMs
This paper proposes an approach to detect social speech signals by computing segmental features using adaptation of segmental Hidden Markov Models (HMMs). This approach uses segmental HMMs and model adaptation techniques such as Maximum Likelihood Linear Regression (MLLR) and Maximum A Posteriori (MAP) in order to acquire specific (or adapted) segmental HMMs that are fine-tuned to detect local r...
Speech recognition using non-linear trajectories in a formant-based articulatory layer of a multiple-level segmental HMM
This paper describes how non-linear formant trajectories, based on ‘trajectory HMM’ proposed by Tokuda et al., can be exploited under the framework of multiple-level segmental HMMs. In the resultant model, named a non-linear/linear multiple-level segmental HMM, speech dynamics are modeled as non-linear smooth trajectories in the formant-based intermediate layer. These formant trajectories are m...
Experimental evaluation of segmental HMMs (1995 International Conference on Acoustics, Speech, and Signal Processing, ICASSP-95)
The aim of the research described in this paper is to overcome important speech-modeling limitations of conventional hidden Markov models (HMMs), by developing a dynamic segmental HMM which models the changing pattern of speech over the duration of some phoneme-type unit. As a first step towards this goal, a static segmental HMM [3] has been implemented and tested. This model reduces the influe...
Start- and end-node segmental-HMM pruning
An efficient decoding algorithm for segmental HMMs (SHMMs) is proposed with multi-stage pruning. The generation by SHMMs of a feature trajectory for each state expands the search space and the computational cost of decoding. This cost is reduced in three ways: pre-cost partitioning, start-node (SN) beam pruning, and conventional end-node (EN) beam pruning. Experiments show that partitioning cuts comput...
Journal: Computer Speech & Language
Volume: 13
Publication date: 1999